This repository contains the code to reproduce the experiments from the paper "Are All Linear Regions Created Equal?".
To install all required dependencies, run:
pip install -r requirements.txt
To configure the computing environment for running the experiments, set the following environment variables:
E_SAVE_DIR="/path/to/savedir" # basedir where to store results
E_NAME="exp_name" # name of the experiment run
E_DATA_DIR="/path/to/dataset/dir" # path where to load data from
E_DEVICE="cuda" # "cpu" or "cuda"
E_WORKERS=1 # number of CPU worker processes
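The scripts read these settings from the environment (or from the equivalent command-line flags). A minimal sketch of the corresponding lookup logic in Python; the fallback defaults below are illustrative assumptions, not values taken from the repository's code:

```python
import os

# Read the experiment configuration from the environment.
# Default values here are illustrative assumptions.
save_dir = os.environ.get("E_SAVE_DIR", "./results")
exp_name = os.environ.get("E_NAME", "exp_name")
data_dir = os.environ.get("E_DATA_DIR", "./data")
device = os.environ.get("E_DEVICE", "cpu")       # "cpu" or "cuda"
workers = int(os.environ.get("E_WORKERS", "1"))  # number of CPU worker processes

print(device, workers)
```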
or use the following command-line arguments with train.py, compute_stats.py, and aggregate_stats.py:
--e_save-dir="/path/to/savedir"
--e_name="exp_name"
--e_data-dir="/path/to/dataset/dir"
--e_device="cuda"
--e_workers="1"
To train a model, run:
python train.py --e_device=cuda --e_name='test' --e_save-dir='checkpoints' --e_data-dir="./data" --e_workers=4 --data="cifar10" --model="vgg8" --epochs=300 --batch-size=128 --augmentation --seed=42 --train-split=49000 --val-split=1000 --eval-every=10
Model checkpoints for the corresponding run are compressed into a single zip archive and stored at ./checkpoints/test/vgg8/cifar10/augmentation/seed-42/checkpoints.zip
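The archive location follows the pattern E_SAVE_DIR/E_NAME/MODEL/DATA/TRAINING_SETTING/seed-SEED/checkpoints.zip. A small sketch that reconstructs the path for the training command above; the helper function and the directory name used when augmentation is disabled are assumptions, not part of the repository:

```python
import os

def checkpoint_archive(save_dir, name, model, data, augmentation, seed):
    """Build the path to a run's checkpoint archive (illustrative helper).

    The "no-augmentation" directory name below is an assumption; the
    repository only documents the augmented case.
    """
    setting = "augmentation" if augmentation else "no-augmentation"
    return os.path.join(save_dir, name, model, data, setting,
                        f"seed-{seed}", "checkpoints.zip")

path = checkpoint_archive("checkpoints", "test", "vgg8", "cifar10",
                          augmentation=True, seed=42)
print(path)
```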
To estimate linear region density and absolute deviation along data-driven paths, load the checkpoints of trained models by specifying a network architecture, dataset, training seed, checkpoint, and dataset split:
python compute_stats.py --e_device="cuda" --e_name=E_NAME --e_save-dir=E_SAVE_DIR --e_workers=1 --e_data-dir=E_DATA_DIR --data=cifar10 --model=vgg8 --augmentation --seed 42 --l_load-checkpoints 1 --train-split=49000 --val-split=1000 --l_gen-strategy="closed-path-train" --l_num-paths=1024 --l_buff-size=30000 --l_closed-path-radius=4 --batch-size=128 --l_num-anchors=8
Results are stored in uncompressed JSON format at E_SAVE_DIR/E_NAME/MODEL/DATA/TRAINING_SETTING/seed-SEED/OUT_NAME-BATCH_ID.json, where BATCH_ID denotes the batch number for which the statistics were computed. The size of the JSON files depends on the number of linear regions discovered, and can be on the order of several gigabytes for large networks or when many paths are generated. It is advisable to compress the JSON files when running several experiments.
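Since the per-batch result files can grow large, compressing them after a run is worthwhile. A sketch using Python's gzip module; the file name is a placeholder following the OUT_NAME-BATCH_ID.json pattern, and the payload is invented so the snippet is self-contained:

```python
import gzip
import json
import shutil

# Placeholder path following the OUT_NAME-BATCH_ID.json pattern above.
src = "stats-0.json"

# Write a tiny invented payload so the snippet runs on its own.
with open(src, "w") as f:
    json.dump({"num_regions": 128, "abs_deviation": 0.25}, f)

# Compress the JSON file to stats-0.json.gz; json.load can later read it
# through gzip.open without decompressing to disk first.
with open(src, "rb") as f_in, gzip.open(src + ".gz", "wb") as f_out:
    shutil.copyfileobj(f_in, f_out)

with gzip.open(src + ".gz", "rt") as f:
    print(json.load(f)["num_regions"])  # 128
```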
Finally, if a single experimental run generates multiple JSON files, aggregate the statistics into a single JSON file that can be used for plotting by running:
python aggregate_stats.py --model MODEL --data DATA --npaths NPATHS --load-from LOAD_FROM.txt --output OUTPUT --checkpoint-id CHECKPOINT_ID --dataset-split DATASET_SPLIT
where LOAD_FROM.txt is a plaintext file listing the JSON results generated by compute_stats.py, with one path per line.
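One way to produce such a file is to glob the per-batch results of a run and write one path per line. A sketch assuming the directory layout described above; the directory and file names here are placeholders, and dummy files are created so the snippet is self-contained:

```python
import glob
import os

# Placeholder results directory following the layout above; create a few
# empty dummy result files so the snippet runs on its own.
results_dir = os.path.join("results", "test", "vgg8", "cifar10",
                           "augmentation", "seed-42")
os.makedirs(results_dir, exist_ok=True)
for batch_id in range(3):
    open(os.path.join(results_dir, f"stats-{batch_id}.json"), "w").close()

# Collect every per-batch JSON file and write one path per line.
paths = sorted(glob.glob(os.path.join(results_dir, "*.json")))
with open("LOAD_FROM.txt", "w") as f:
    f.write("\n".join(paths) + "\n")

print(len(paths))  # 3
```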